-
Notifications
You must be signed in to change notification settings - Fork 458
[SwiftParser] Improve statement level recovery #3156
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
c706d4c
to
2ca55fe
Compare
_ = subparser.consumeAttributeList() | ||
hasAttribute = true | ||
} else if subparser.at(.poundIf) && subparser.consumeIfConfigOfAttributes() { | ||
subparser.skipSingle() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
subparser.consumeIfConfigOfAttributes()
already skipped #if .... #endif
ef1bf91
to
dc4e479
Compare
base: .syntax, | ||
nameForDiagnostics: "declaration", | ||
parserFunction: "parseDeclaration" | ||
parserFunction: "parseDeclarationOrIfConfig" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parseDeclaration()
don't parse #if ... #endif
. Newly introduced parseDeclarationOrIfConfig()
just for DeclSyntax.parse(from:)
mutating func atStartOfDeclaration( | ||
isAtTopLevel: Bool = false, | ||
allowInitDecl: Bool = true, | ||
allowRecovery: Bool = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed allowRecovery
from atStartOfDeclaration()
/atStartOfStatement()
/atStartOfSwitchCase()
because all statement-level recoveries are now done by parseUnexpectedCodeDecl()
mutating func parseDeclaration(in context: DeclarationParseContext = .topLevelOrCodeBlock) -> RawDeclSyntax { | ||
// If we are at a `#if` of attributes, the `#if` directive should be | ||
// parsed when we're parsing the attributes. | ||
if self.at(.poundIf) && !self.withLookahead({ $0.consumeIfConfigOfAttributes() }) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This block is basically hoisted to parseMemberBlockItem()
- To align with
parseCodeBlockItem()
scheme - Don't want to attach
;
toIfConfigDeclSyntax
let isProbablyVarDecl = self.at(.identifier, .wildcard) && self.peek(isAt: .colon, .equal, .comma) | ||
let isProbablyTupleDecl = self.at(.leftParen) && self.peek(isAt: .identifier, .wildcard) | ||
|
||
if isProbablyVarDecl || isProbablyTupleDecl { | ||
return RawDeclSyntax(self.parseBindingDeclaration(attrs, .missing(.keyword(.var)), in: context)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Factored out to atBindingDeclarationWithoutVarKeyword()
and move the handling to recoveryResult
above, aligning with atFunctionDeclarationWithoutFuncKeyword()
.
|
||
if atFunctionDeclarationWithoutFuncKeyword() { | ||
return RawDeclSyntax(self.parseFuncDeclaration(attrs, .missing(.keyword(.func)))) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is already handled in recoveryResult
above.
if self.atStartOfStatement(preferExpr: true) || self.atStartOfDeclaration() { | ||
return false | ||
} | ||
if self.atStartOfLine && self.withLookahead({ $0.atStartOfSwitchCase() }) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously case
was a start of statement, but not anymore. This prevents the second case below is parsed as the returning expression.
case .foo:
return
case bar:
...
|
||
let item: RawCodeBlockItemSyntax.Item | ||
let attachSemi: Bool | ||
if self.at(.poundIf) && !self.withLookahead({ $0.consumeIfConfigOfAttributes() }) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inlined parseItem()
because we want to choose whether to attach ;
(attachSemi
) depending on what we parse.
} else { | ||
// Otherwise, eat the unexpected tokens into an "decl". | ||
item = .decl( | ||
RawDeclSyntax( | ||
self.parseUnexpectedCodeDeclaration(allowInitDecl: allowInitDecl, requiresDecl: false, until: stopCondition) | ||
) | ||
) | ||
attachSemi = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously in parseItem()
, we tried atStartOfStatement(allowRecovery: true)
and atStartOfDeclaration(allowRecovery: true)
for recoveries.
Now unexpected tokens at this position are always parsed as UnexpectedCodeDeclSyntax
until we find a decl, statement, or expression.
1 | func o() { | ||
2 | }👨👩👧👦} | ||
| |`- error: extraneous braces at top level | ||
| |`- error: unexpected braces in source file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This message change is because previously the UnexpectedNodesSyntax
was stored as SourceFileSyntax.unexpectedBetweenStatementsAndEndOfFileToken
, but now it's UnexpectedCodeDeclSyntax
in SourceFileSyntax.statements
.
DiagnosticSpec( | ||
locationMarker: "2️⃣", | ||
message: "unexpected code '))' before macro" | ||
message: "unexpected code '))' in class" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This and similar changes are because the unexpected code are now separate UnexpectedCodeDeclSyntax
in the statements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So IIUC this parses as an attribute on a class with a macro member? The change from unexpected to a node seems fine to me diagnostic wise, there's not much difference in the "before" vs "in"
5ceb6e8
to
f9a97cf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice cleanup!
DiagnosticSpec( | ||
locationMarker: "2️⃣", | ||
message: "unexpected code '))' before macro" | ||
message: "unexpected code '))' in class" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So IIUC this parses as an attribute on a class with a macro member? The change from unexpected to a node seems fine to me diagnostic wise, there's not much difference in the "before" vs "in"
), | ||
DiagnosticSpec(locationMarker: "2️⃣", message: "unexpected code in source file"), | ||
], | ||
fixedSource: """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does the fixed source still have the #endif
and extra }
? Shouldn't both be unexpected and thus removed in the fixed source? Is the new UnexpectedDeclNode not being treated the same as unexpected?
I'm also unsure about closing the struct at the endif. It makes sense if there's a matching #if, but when there isn't... 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Diagnostics are conservative about removing user code. UnexpectedNodesSyntax
in general might contain important user code and we don't want to delete them in "fix-it all"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also unsure about closing the struct at the endif.
We can't know if there's }
after #endif
until we parse it.
IMO #endif
et al should always be stronger than }
, because, unlike {
or }
which editor often insert, the user probably intentionally wrote #endif
for some reason.
This specific test case might be controversial, but for example, for a source file like this:
struct MyStruct {
func foo() {}
// ...
#endif
struct Other {
....
}
This is probably missing }
before #endif
diagnostics: [ | ||
DiagnosticSpec( | ||
locationMarker: "1️⃣", | ||
// FIXME: "expected attribute name after '@'". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you file an issue for this rather than have the FIXME? Or add a specific test for the message, with the FIXME + a link to the issue? Same for the one below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
) | ||
} | ||
|
||
func testDeclSyntaxParseIfConfigAttr() throws { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding these 🙇
message: "expected declaration and ';' after attribute", | ||
fixIts: ["insert declaration and ';'"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we default to ;
instead of a newline for this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is modeled as
CodeBlockItem(
item: .decl(MissingDecl(...)),
semicolon: .semiColonToken(precence: .missing)
)
And ParseDiagnosticsGenerator
generates single diagnostic for consecutive missing nodes.
"consecutive statements ... insert newline or ';'" is a special diagnostics that is generated when the previous node was not missing.
Tests/SwiftParserTest/translated/ForwardSlashRegexSkippingInvalidTests.swift
Show resolved
Hide resolved
swiftlang/swift#84553 |
swiftlang/swift#84553 |
1 similar comment
swiftlang/swift#84553 |
Consider the following code:
Previously, this was parsed as a
FunctionDeclSyntax
with the unexpected code} @attr
attached asunexpectedBeforeFuncKeyword
. This was problematic because@attr
was effectively ignored. TheTokenPrecedence
-based recovery strategy couldn't handle this well, since@
is not a declaration keyword and doesn't by itself signal the start of a declaration.One option would be to check
atStartOfDeclaration()
after eachskipSingle()
and recognize@
as the start of a declaration. However, then the preceding}
would need to be attached to the first attribute in the attribute list, which would require threading recovery logic intoparseAttributeList()
and only applying it to the firstparseAttribute()
. This would make the code more complex and harder to maintain.This PR introduces a new syntax node
UnexpectedCodeDeclSyntax
parseUnexpectedCodeDeclaration()
.This simplifies recovery handling significantly.
Also:
#if
related recovery:#endif
et al at top-level is nowUnexpectedCodeDecl
instead of making everything after it "unexpected"#endif
et al closes code/member block with missing}
.parsePoundDirective()
. It now receives only one closure instead of 3.#sourceLocation(file: "<file>", line: 100) ;
;
is now diagnosed as;
-only statement, or unexpected;
separator depending on the position.}
or statements. E.g.